lectures.alex.balgavy.eu

Lecture notes from university.

+++
title = 'Sound processing'
template = 'page-math.html'
+++
# Sound processing
Problems with sound capture:
1. Acquired signals are very noisy
2. Context information is hidden

How do we process sound to classify it and extract information?

Basic features of sound:

- Volume (air pressure) or loudness (dB): the amplitude of the wave
- Frequency (Hz) or pitch: the frequency of the wave
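
To make the two features concrete, here is a minimal sketch in plain Python that generates a sampled sine wave and measures its amplitude and its level in decibels. The sampling rate, the tone parameters, the reference amplitude of 1.0 and all function names are assumptions chosen for illustration; they do not come from the lecture.

```python
import math

def sine_wave(freq_hz, amplitude, sample_rate, n_samples):
    """Return n_samples of a sine wave with the given frequency and amplitude."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def peak_amplitude(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

def level_db(amplitude, reference=1.0):
    """Level in decibels relative to a reference amplitude: 20 * log10(A / A_ref)."""
    return 20 * math.log10(amplitude / reference)

# Assumed example tone: 440 Hz, amplitude 0.5, one second at 8000 samples/s.
tone = sine_wave(freq_hz=440.0, amplitude=0.5, sample_rate=8000, n_samples=8000)
peak = peak_amplitude(tone)
print(f"peak amplitude: {peak:.3f}")
print(f"level: {level_db(peak):.2f} dB relative to amplitude 1.0")
```

Halving the amplitude lowers the level by roughly 6 dB, which is why the decibel scale is convenient for comparing loudness.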

## Periodic signal: Fourier

When the signal is sinusoidal, it’s simple to calculate the frequency from the period $T$ of the wave: $f = 1/T$.
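
For a sampled sinusoid, one way to get the period is to measure the time between successive rising zero crossings. The sketch below assumes this approach; `estimate_frequency` and the 50 Hz test tone are hypothetical names and values used only for illustration.

```python
import math

def estimate_frequency(samples, sample_rate):
    """Estimate the frequency of a sinusoid from its rising zero crossings."""
    crossings = []
    for n in range(1, len(samples)):
        if samples[n - 1] < 0 <= samples[n]:
            # Linear interpolation gives the sub-sample crossing position.
            frac = -samples[n - 1] / (samples[n] - samples[n - 1])
            crossings.append(n - 1 + frac)
    if len(crossings) < 2:
        raise ValueError("need at least two rising zero crossings")
    # Average period T in seconds between the first and last crossing.
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1) / sample_rate
    return 1.0 / period

sample_rate = 8000
# Assumed test tone: 50 Hz with an arbitrary phase shift of 1 radian.
signal = [math.sin(2 * math.pi * 50 * n / sample_rate + 1.0) for n in range(sample_rate)]
print(f"estimated frequency: {estimate_frequency(signal, sample_rate):.2f} Hz")
```

This only works for a clean, single sinusoid; a noisy or composite signal crosses zero at irregular times.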

But if it’s not sinusoidal, what do you do? Analyse the frequency spectrum. Enter Fourier.

Fourier: almost every periodic signal can be broken down into multiple sinusoidal waves with different frequencies and amplitudes.

Instead of describing the signal by its amplitude as a function of time, describe it by its amplitude as a function of frequency.

![screenshot.png](2f8ad86778ebb0b91e9ebc527decb0d4.png)

Then you end up with a Fourier series: a sum of simple sinusoidal waves with frequencies $k f_0$, amplitudes $A_k$ and phase shifts $\phi_k$:

$x(t) = A_{0} + \sum_{k=1}^N A_{k} \sin (2 \pi k f_{0} t + \phi_{k})$
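
The series can be evaluated directly from its definition. The sketch below is an illustration under assumed values: it uses odd harmonics with amplitudes $4/(\pi k)$, which approximate a square wave, and the names `fourier_series`, `amplitudes` and `phases` are hypothetical.

```python
import math

def fourier_series(t, f0, a0, amplitudes, phases):
    """Evaluate x(t) = A_0 + sum_k A_k * sin(2*pi*k*f0*t + phi_k) for k = 1..N."""
    total = a0
    for k, (a_k, phi_k) in enumerate(zip(amplitudes, phases), start=1):
        total += a_k * math.sin(2 * math.pi * k * f0 * t + phi_k)
    return total

# Assumed example: odd harmonics with amplitudes 4/(pi*k) approximate a square wave.
N = 25
amplitudes = [4 / (math.pi * k) if k % 2 == 1 else 0.0 for k in range(1, N + 1)]
phases = [0.0] * N
f0 = 100.0  # fundamental frequency in Hz

# A quarter of the way through one period, the square wave is at its high level.
value = fourier_series(t=0.25 / f0, f0=f0, a0=0.0, amplitudes=amplitudes, phases=phases)
print(f"x(T/4) with {N} harmonics: {value:.3f}")
```

Adding more harmonics brings the value closer to 1, the level of the ideal square wave.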

The periodic signal has a frequency spectrum consisting of various harmonics:

![screenshot.png](8ecb6e39f786a6738ceaea52c1640948.png)

The component frequencies are integer multiples of the fundamental frequency $f_0$; these components are called harmonics.

You can calculate the amplitudes $A_k$ with an algorithm called the FFT (Fast Fourier Transform), which operates on vectors.

You put in the vector of $N$ samples, and you get out a vector of $N$ complex coefficients whose magnitudes give the amplitudes (up to a scale factor):

- The first element is the DC component, with frequency 0
- For a real-valued signal the second half of the vector mirrors the first, so you can really only use the first half (indices 0 to $N/2$)
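
As an illustration of what goes in and what comes out, here is a small radix-2 FFT in plain Python, followed by a helper that turns the coefficients into amplitudes. This is a teaching sketch that assumes the number of samples is a power of two; in practice you would normally call a library FFT. The function names and the test signal are assumptions.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def amplitude_spectrum(samples):
    """Amplitudes for bins 0..N/2 of a real-valued signal."""
    n = len(samples)
    coeffs = fft(samples)
    amps = [2 * abs(c) / n for c in coeffs[: n // 2 + 1]]
    amps[0] /= 2   # the DC component has no mirrored partner, so it is not doubled
    amps[-1] /= 2  # the same holds for the bin at N/2
    return amps

N = 64
# Assumed test signal: DC offset 1.0 plus a sine of amplitude 0.5 in bin 4.
samples = [1.0 + 0.5 * math.sin(2 * math.pi * 4 * n / N) for n in range(N)]
amps = amplitude_spectrum(samples)
print([round(a, 3) for a in amps[:6]])
```

The factor 2 accounts for the mirrored second half of the vector: each sinusoid's energy is split between bin $k$ and bin $N - k$.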

Formulas:

<table>
<tr><td>Frequency step</td>
<td>Frequency of amplitude $A_k$</td>
<td>Nyquist frequency</td>
<td>Last useful amplitude</td>
</tr>
<tr>
<td>$\Delta f = \frac{F_s}{N}$</td>
<td>$f_{k} = k \Delta f = \frac{kF_{s}}{N}$</td>
<td>$f_c = \frac{F_s}{2}$</td>
<td>$f_{N/2} = \frac{N}{2} \Delta f = \frac{F_s}{2}$</td>
</tr>
</table>

Nyquist frequency ($f_c$): the maximum frequency that can be detected using the FFT, equal to half the sampling rate $F_s$.
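
A short worked example of these formulas, assuming a sampling rate of 8000 Hz and 1024 samples (both values are chosen only for illustration, as are the function names):

```python
def frequency_step(sample_rate, n_samples):
    """Delta f = Fs / N."""
    return sample_rate / n_samples

def bin_frequency(k, sample_rate, n_samples):
    """f_k = k * Fs / N."""
    return k * sample_rate / n_samples

def nyquist_frequency(sample_rate):
    """f_c = Fs / 2."""
    return sample_rate / 2

Fs = 8000  # assumed sampling rate in Hz
N = 1024   # assumed number of samples

print(frequency_step(Fs, N))         # 7.8125
print(bin_frequency(100, Fs, N))     # 781.25
print(nyquist_frequency(Fs))         # 4000.0
print(bin_frequency(N // 2, Fs, N))  # 4000.0
```

The last two lines agree: the last useful amplitude, at index $N/2$, sits exactly at the Nyquist frequency.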

## Not periodic: short-time analysis
Some sound signals, such as speech, are periodic only for a very short time.

![screenshot.png](5a9081f841b448d241811917f4eea3e3.png)![screenshot.png](fb0360fdcbdf2c0fa8c15ce7ddbe6670.png)

Cut the speech into segments (frames). Then you can apply the FFT to each of those pieces.
This is called segmentation or windowing.
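
Segmentation can be sketched as follows. The frame length, the hop between frames and the Hann taper applied to each frame are assumptions for illustration; the lecture does not specify a particular window shape or overlap.

```python
import math

def split_into_frames(samples, frame_length, hop_length):
    """Cut a signal into frames of frame_length samples, hop_length apart."""
    frames = []
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frames.append(samples[start:start + frame_length])
    return frames

def hann_window(frame):
    """Taper a frame towards zero at both ends (Hann window)."""
    n = len(frame)
    return [s * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
            for i, s in enumerate(frame)]

# Assumed test signal: 1000 samples of a sine wave.
signal = [math.sin(2 * math.pi * 5 * n / 100) for n in range(1000)]
frames = split_into_frames(signal, frame_length=256, hop_length=128)
windowed = [hann_window(f) for f in frames]
print(len(frames), "frames of", len(frames[0]), "samples")
```

Overlapping the frames (a hop shorter than the frame) means that samples attenuated at the edge of one frame are near the centre of the next.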

### Spectrogram
The frequency spectrum varies over time.

A spectrogram is a graph with time on the x-axis, frequency on the y-axis, and colour showing the amplitude of each frequency.

![screenshot.png](fe629573739f7ff022dd7c5ae666c281.png)

![screenshot.png](e90248e66991c5183a713e851b9fbda8.png)
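
The data behind such a graph is one spectrum per frame. The sketch below computes it with a direct (slow) DFT to stay dependency-free; the frame size, the test signal that switches from 50 Hz to 200 Hz, and the function names are assumptions for illustration.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of one frame (direct O(N^2) DFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectrogram(samples, frame_length, hop_length):
    """One magnitude spectrum per frame: rows are time, columns are frequency."""
    return [magnitude_spectrum(samples[start:start + frame_length])
            for start in range(0, len(samples) - frame_length + 1, hop_length)]

sample_rate = 1000
# Assumed test signal: 50 Hz for the first half second, then 200 Hz.
signal = [math.sin(2 * math.pi * (50 if n < 500 else 200) * n / sample_rate)
          for n in range(1000)]
spec = spectrogram(signal, frame_length=100, hop_length=100)

for i, row in enumerate(spec):
    peak_bin = max(range(len(row)), key=row.__getitem__)
    print(f"t = {i * 100 / sample_rate:.1f} s: strongest at {peak_bin * sample_rate / 100:.0f} Hz")
```

Plotting `spec` as an image, with the row index as time and the column index as frequency, gives the spectrogram.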

### Digital filtering
Time domain: moving average filter (each output sample is the mean of the most recent input samples, which smooths out noise)
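
A minimal sketch of a moving average filter, assuming a causal window that covers the current sample and the ones just before it. The name `moving_average` and the window size are illustrative choices.

```python
def moving_average(samples, window_size):
    """Replace each sample by the mean of it and the window_size - 1 samples before it."""
    smoothed = []
    running_sum = 0.0
    for i, s in enumerate(samples):
        running_sum += s
        if i >= window_size:
            # Drop the sample that just left the window.
            running_sum -= samples[i - window_size]
        # The first few outputs average over fewer samples.
        smoothed.append(running_sum / min(i + 1, window_size))
    return smoothed

noisy = [0, 10, 0, 10, 0, 10, 0, 10]
print(moving_average(noisy, window_size=2))
# [0.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
```

A larger window smooths more strongly but also blurs fast changes in the signal.
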

Frequency domain:

- Low-pass: allow only frequencies below a cutoff
- High-pass: allow only frequencies above a cutoff
- Band-pass: allow only a certain frequency band
- Band-reject (notch filter): allow everything but a certain frequency band
    - sample the signal, compute the spectrum using the FFT, set to zero the portions of the spectrum that are just noise, and apply the inverse FFT to synthesise the improved signal
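
The band-reject recipe above can be sketched as follows. To stay dependency-free it uses a direct DFT and inverse DFT rather than a library FFT, and the signal (a 20 Hz tone with 50 Hz hum), the rejected band and the function names are all assumptions for illustration.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(coeffs):
    """Inverse transform; returns the real part of each reconstructed sample."""
    n = len(coeffs)
    return [(sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def band_reject(samples, sample_rate, low_hz, high_hz):
    """Zero every bin whose frequency lies in [low_hz, high_hz], then resynthesise."""
    n = len(samples)
    coeffs = dft(samples)
    for k in range(n):
        # Bins above N/2 mirror the lower half, so map them to the same frequency.
        freq = min(k, n - k) * sample_rate / n
        if low_hz <= freq <= high_hz:
            coeffs[k] = 0
    return inverse_dft(coeffs)

sample_rate, n = 1000, 200
wanted = [math.sin(2 * math.pi * 20 * t / sample_rate) for t in range(n)]
hum = [0.5 * math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(n)]
noisy = [w + h for w, h in zip(wanted, hum)]

cleaned = band_reject(noisy, sample_rate, low_hz=45, high_hz=55)
max_error = max(abs(c - w) for c, w in zip(cleaned, wanted))
print(f"largest deviation from the clean signal: {max_error:.2e}")
```

The mirrored bins must be zeroed together with their partners; otherwise the inverse transform would no longer produce a real-valued signal. The result here is nearly exact because both tones fall on exact bin frequencies; with real recordings some of the unwanted energy leaks into neighbouring bins.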